(57) Abstract: Communication systems which employ multiple transmit and receive antenna-element arrays. Data streams for
transmission may be interleaved among the transmit antenna elements in order to reduce decision errors. Turbo processing of equalizer
output from a number of layers in a layered space-time processing architecture may be employed to reduce decision errors.
Additionally, space-time equalization may be performed to maximize signal to noise ratio such as via minimum mean square error
processing, rather than zero forcing, in order to achieve the Shannon limit, reduce multi-path effects and/or reduce intersymbol
interference. Moreover, the receiver can select the number and/or identity of receive antenna elements from among a larger group in order
to optimize performance of the system.
TURBO DETECTION OF SPACE-TIME CODES



RELATED APPLICATIONS 

This document claims priority to and incorporates by reference
copending provisional USSN 60/152,982 entitled "Turbo Space-Time
Processing to Improve Wireless Channel Capacity" filed on September 9,
1999.

FIELD OF INVENTION 

The present invention relates to systems and processes for radio
communications using multiple-element antenna array technology. 

BACKGROUND 

Turbo processing and space-time equalization are terms that
comprehend several conventional ways to increase wireless channel capacity.
Generally, turbo coding and/or processing refers to techniques aimed at
approaching the Shannon limit in a channel, while space-time processing
refers to techniques for processing signals from multi-element antenna arrays
to exploit the multi-path nature of fading wireless environments.

European patent application no. EP 817 401 A2 published July 1, 1998 
in the name of Foschini discloses the use of a number of processing layers for 
space time processing of signals from multiple-receiver antenna elements. 
There, the transmitter feeds a number of transmitter antenna elements by 
cyclically apportioning segments of the modulated encoded stream of data to 
transmitter antenna elements. At the receiver, a number of receiver antenna 
elements are coupled to a number of processing layers, in order to perform 
the space-time processing. Signal components received during respective 
periods of time over a plurality of the receive antenna elements are formed
into respective space and time relationships in which space is associated with 
respective transmitter antenna elements. Preprocessing occurs so that a 
collection of signal components having the same space-time relationship
forms a signal vector such that particular decoded signal contributions can be
subtracted from the signal vector while particular undecoded contributions can
be nulled out of the signal vector. The resulting vector is then supplied to a
decoder for decoding to reform the data stream. Such conventional systems
and techniques are further described in documents referred to in the "Detailed
Description" section of this document.

SUMMARY OF THE INVENTION 
Systems and processes according to the present invention employ a 
number of transmitter antenna elements and a number of receiver antenna 
elements coupled to multiple space-time processing layers in the receiver. In 
the present invention, however, portions of the information stream being 
communicated can be interleaved among transmitter antenna elements such 
as on a random or pseudo random basis; among other things, such 
interleaving decreases decision errors in the space-time equalization process.
Furthermore, each processing layer preferably includes turbo processing in
order to feed soft decisions about information being processed back to the
equalizers. Moreover, space-time equalization processes according to the
present invention preferably seek to maximize signal to noise ratio rather than
zero forcing, as well as reduce multi-path effects and intersymbol interference.
A preferred process uses minimum mean square error processing which
allows the Shannon limit actually to be achieved. Furthermore, systems and
processes according to the present invention preferably allow selection of the
number and identity of receiver antenna elements to which the receiver may
be coupled in order to optimize performance.
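
The pseudo-random interleaving of a data stream among transmitter antenna elements described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the use of a shared seed to reproduce the permutation at the receiver, and the round-robin dealing of the permuted symbols to the antennas are all assumptions made for illustration.

```python
import random


def make_permutation(length, seed):
    """Return a pseudo-random permutation of range(length), reproducible from seed."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)
    return perm


def interleave_to_antennas(symbols, n_tx, seed):
    """Permute the stream, then deal it out so antenna a gets every n_tx-th symbol."""
    perm = make_permutation(len(symbols), seed)
    shuffled = [symbols[p] for p in perm]
    return [shuffled[a::n_tx] for a in range(n_tx)]


def deinterleave_from_antennas(streams, seed):
    """Invert interleave_to_antennas, given the same (assumed shared) seed."""
    n_tx = len(streams)
    length = sum(len(s) for s in streams)
    shuffled = [None] * length
    for a, stream in enumerate(streams):
        shuffled[a::n_tx] = stream          # undo the round-robin dealing
    perm = make_permutation(length, seed)
    out = [None] * length
    for position, p in enumerate(perm):
        out[p] = shuffled[position]         # undo the permutation
    return out
```

With a fixed permutation of period equal to the number of antennas, this reduces to the periodic cycling of the prior art; the pseudo-random permutation is what spreads neighbouring symbols over different antennas in an irregular pattern.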

According to one embodiment of the invention, an information source is
coupled to provide a plurality of data streams to a plurality of transmit
antennas, via, for each stream, an encoder, interleaver and symbol mapper.
On the receiver side, a plurality of M receiver elements are coupled to a
plurality of processing layers. The number of receiver antenna elements M is
preferably greater than or equal to the number N of transmit antenna
elements, since equalization according to the present invention does not
require an extra degree of freedom. The M receiver antenna elements are
coupled to the first processing layer by coupling to a space-time equalizer
which preferably applies minimum mean square error processing in order to
maximize signal to noise ratio. The output of the equalizer is applied to a
deinterleaver, after which the deinterleaved stream is supplied to a decoder in
the layer. Output of the decoder is provided for output common with the
output from the other decoders in the other layers. Preferably, each layer also
includes an interleaver which receives output from the decoder and
deinterleaver and supplies its interleaved output back to the equalizer in the
layer in order to provide soft decision making to the equalizer. In successive
processing layers, output from the decoder of the preceding layer is combined
with information from the interference canceller of the layer preceding the
preceding layer (except the second layer, which receives signals from an
interference canceller which is coupled to the decoder of the first layer and to
the receive antenna elements).

According to an alternate embodiment, the deinterleaver, interleaver
and decoder are shared among layers, so that the equalizer of each layer 
outputs to a deinterleaver common to all layers. The output of the 
deinterleaver may then be coupled to a decoder which again is common to all 
layers. An interleaver may be provided which receives output from the 
deinterleaver and the decoder and applies it to each equalizer for soft 
decisions to be applied to the equalizers. 

Accordingly, components for deinterleaving, decoding and 
reinterleaving may be functionally located in each layer, or common to the 
layers. In the first case, each layer below the first layer processes signals 
from an interference canceller which receives signals from a decoder in the
preceding layer and from the antenna elements (in the case of the second 
layer) or the interference canceller in the next-preceding layer (in the case of 
other layers). In the second case, each layer below the first processes 
signals from an interference canceller which receives signals from the 
equalizer in the preceding layer and from the antenna elements (in the case of
the second layer) or the interference canceller in the next preceding layer (in
the case of other layers). Such turbo processing architectures can be used in 
connection with layered space-time equalization which relies on zero forcing 
rather than minimum mean square error processing. They can also be used in
multi-array systems in which the data streams are periodically cycled rather than
interleaved. 

It is accordingly an object of the present invention to provide improved
layered space-time processing for communication systems which employ turbo
processing techniques in order, among other things, to reduce decision errors.

It is an additional object of the present invention to provide layered 
space-time processing for communication systems which seeks to maximize 
signal to noise ratio, thereby better addressing the Shannon limit, and which 
also addresses multi-path effects and/or intersymbol interference.

It is an additional object of the present invention to provide processing
for communication systems in which data streams may be interleaved rather
than periodically cycled among transmit antenna elements, in order, among
other things, to reduce decision errors.

It is an additional object of the present invention to provide layered 
space-time processing for communication systems in which a receiver can 
select a set of antenna elements, including their number and/or identity, from
among a larger group of antenna elements in order to optimize performance
of the system.

Other objects, features, and advantages of the present invention will
become apparent with respect to the remainder of this document.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1(a) is a schematic diagram showing a first embodiment of 
communications systems according to the present invention. 

Fig. 1(b) is a schematic diagram showing a second embodiment of 
communications systems according to the present invention. 

Fig. 2 is a schematic diagram showing one form of space-time
processing according to the present invention. 
Figs. 3(a) and 3(b) are diagrams which compare performance between
two coding schemes according to the present invention. 

Fig. 4 is a diagram which shows different capacity bounds for 
processing according to the present invention over a flat Rayleigh fading 
channel. 

Fig. 5 is a diagram which shows simulation results for a system
according to the first embodiment of the present invention with two transmit 
and two receive antenna elements. 

Fig. 6 is a diagram which shows simulation results for a system 
according to the first embodiment of the present invention with two transmit 
and four receive antenna elements. 

Fig. 7 is a diagram which shows simulation results for a system 
according to the first embodiment of the present invention with two, four and 
eight receive antenna elements, using soft decisions and six turbo iterations. 

Fig. 8 is a diagram which shows simulation results for a system 
according to the second embodiment of the present invention with two, four
and eight receive antenna elements, using soft decisions and two turbo 
iterations. 

Figs. 9(a) and 9(b) are diagrams which show simulated performance of 
the first embodiment of the present invention using soft decisions and four 
antenna elements for typical urban and hilly terrain profiles. 

DETAILED DESCRIPTION 

The documents and references cited in the following disclosure are
incorporated herein by this reference. 



Abstract: By deriving a generalized Shannon capacity formula
for multiple-input, multiple-output Rayleigh fading channels, and 
by suggesting a layered space-time architecture concept that at- 
tains a tight lower bound on the capacity achievable, Foschini has 
shown a potential enormous increase in the information capacity 
of a wireless system employing multiple-element antenna arrays 
at both the transmitter and receiver. The layered space-time 
architecture allows signal processing complexity to grow linearly, 
rather than exponentially, with the promised capacity increase. 
This paper includes two important contributions: First, we show 
that Foschini's lower bound is, in fact, the Shannon bound when the
output signal-to-noise ratio (SNR) of the space-time processing in
each layer is represented by the corresponding "matched filter"
bound. This proves the optimality of the layered space-time 
concept. Second, we present an embodiment of this concept 
for a coded system operating at a low average SNR and in the 
presence of possible intersymbol interference. This embodiment 
utilizes the already advanced space-time filtering, coding and 
turbo processing techniques to provide yet a practical solution 
to the processing needed. Performance results are provided for
quasi-static Rayleigh fading channels with no channel estimation 
errors. We see for the first time that the Shannon capacity for 
wireless communications can be both increased by N times (where 
N is the number of the antenna elements at the transmitter 
and receiver) and achieved within about 3 dB in average SNR, 
about 2 dB of which is a loss due to the practical coding scheme 
we assume; the layered space-time processing itself is nearly
information-lossless! 

Index Terms — Equalization, interference suppression, space- 
time processing, turbo processing. 



I. INTRODUCTION

"TURBO" and "space-time" are two of the most explored
concepts in modern-day communication theory and
wireless research. From a communication theorist's viewpoint,
"turbo" coding/processing is a way to approach the Shannon
limit on channel capacity, while "space-time" processing is
a way to increase the possible capacity by exploiting the rich
multipath nature of fading wireless environments. We will see
through a specific embodiment in this paper that combining the
two concepts provides even a practical way to both increase
and approach the possible wireless channel capacity.

With growing bit rate demand in wireless communications,
it is especially important to use the spectral resource efficiently.

Paper approved by K. B. Letaief, the Editor for Wireless Systems of the IEEE
Communications Society. Manuscript received September 15, 1999; revised
December J, 1999. This paper was presented at the IEEE International Conference
on Communications, New Orleans, LA, June 2000.

The author is with the Home Wireless Networks, Norcross, GA 30071 USA
(e-mail: lek@homewireless.com).

Publisher Item Identifier S 0090-6778(00)07111-7.



The basic information theory results reported by Foschini and
Gans [1] have promised extremely high spectral efficiencies
possible through multiple-element antenna array technology.
In high scattering wireless environments (e.g., troposcatter,
cellular, and indoor radio), the use of multiple spatially separated
and/or differently polarized antennas at the receiver has
been very effective in providing diversity against fading [2],
[3]. Receiver diversity techniques also create signal processing
opportunities for interference suppression and equalization
(e.g., [4]-[6]). However, using multiple antennas at either the
transmitter or the receiver does not enable a significant gain in
the possible channel capacity. According to [1], the Shannon
capacity for a system with 1 transmit and N receive antennas
scales only logarithmically with N, as N → ∞. For a system
using N transmit and 1 receive antennas, asymptotically
there is no additional capacity to be gained, assuming that the
transmit power is divided equally among the N antennas.

Foschini and Gans [1] have shown that the asymptotic
capacity of multiple-input, multiple-output (MIMO) Rayleigh
fading channels grows, instead, linearly with N when N
antennas are used at both the transmitter and the receiver.
Furthermore, in [7], Foschini suggested a layered space-time
architecture concept that can attain a tight lower bound on the
capacity achievable. In this layered space-time architecture,
N information bit streams are transmitted simultaneously
(in the same frequency band) using N diversity antennas.
The receiver uses another N diversity antennas to decouple
and detect the N transmitted signals, one signal at a time.
The decoupling process in each of the N processing "layers"
involves a combination of nulling out the interference from
yet undetected signals (N diversity antennas can null up to
N - 1 interferers, regardless of the angles-of-arrival [5]) and
canceling out the interference from already detected signals.
One very significant aspect of this architecture is that it
allows an N-dimensional signal processing problem, which
would otherwise be solvable only through multiuser detection
methods [8] with $m^N$ complexity (m is the signal constellation
size), to be solved with only N similar 1-D processing steps.
Namely, the processing complexity grows only linearly with
the promised capacity.

This paper includes two important contributions. First, we
show that Foschini's lower bound is, in fact, the Shannon
bound when the output SNR of the space-time processing in
each layer is represented by the corresponding "matched filter"
bound [6], i.e., the maximum SNR achievable in a hypothetical
situation where the array processing weights to suppress the
remaining interference in each layer are chosen to maximize the
output signal-to-interference-plus-noise ratio and any possible
intersymbol interference (ISI) is assumed to be completely
eliminated by some means of equalization. The "matched filter"
bound has been shown to be approachable using minimum
mean-square error (MMSE) space-time filtering techniques
[6].¹ By showing the equivalence of the generalized Foschini's
bound and the Shannon bound, we essentially prove the
optimality of the layered space-time concept.

Second, we present an embodiment of Foschini's layered
space-time concept for a coded system operating at a low average
SNR and in the presence of unavoidable ISI. Previously,
a different embodiment has been provided in [9] for an uncoded
system with variable signal constellation sizes, operating at
a high average SNR without ISI. Adding coding redundancy
might, at first, seem conflicting with the desire to increase the
channel bit rate. Our justification is as follows: First, we seek
to enhance the channel capacity from a system perspective.
We use "noise" in SNR to represent all system impairments,
including thermal noise and multiuser interference. The ability
to operate at low SNR's means that more users per unit area
can occupy the same bandwidth simultaneously. Second, we
anticipate the use of adaptive-rate coding schemes to permit
different degrees of error protection according to the channel
SNR's. Incremental redundancy transmission [10], currently
being considered for the Enhanced Data Services for GSM
Evolution (EDGE; GSM stands for Global System for Mobile
Communications) standard, is an efficient way to implement
adaptive code rates without requiring channel SNR monitoring.
With such adaptive-rate coding, the system does not "waste"
spectral resources under good channel conditions.

Meanwhile, the iterative processing principle used in turbo
and serial concatenated coding [11]-[15] has been successfully
applied to a wide variety of joint detection and decoding problems.
One such application is the so-called "turbo equalization"
[16]-[19], where successive maximum a posteriori (MAP)
processing is performed by the equalizer and channel decoder
to provide a priori information about the transmit sequence
to one another. Similar to the layered space-time concept,
turbo processing allows a multi-dimensional (two-dimensional
in this case) problem to be optimally solved with successive
1-D processing steps without much performance penalty. In
this paper, we apply the turbo principle to layered space-time
processing in order to prevent decision errors produced in each
layer from catastrophically affecting the signal detection in
subsequent layers.

We consider two possible coded layered space-time structures:
one applying coding across the multiple signal processing
layers, and the other assuming independent coding within each
layer. Similar to [1], we assume a quasi-static random Rayleigh
channel model, where the channel characteristics are stationary
within each data block, but statistically independent between
different data blocks, different antennas, and, in the case of
dispersive multipath channels, different paths. The system is
assumed to have similar ISI situations as in EDGE and GSM,
where multipath dispersions may last up to several symbol
periods [20]. We show that near-capacity performance is achievable
using 1-D processing and coding techniques that are already
practical and "legacy-compatible" with the EDGE standard,
e.g., the use of bit-interleaved 8-ary phase-shift keying
(8-PSK) with rate-1/3 convolutional coding and an equalizer
with a similar length and structure.

¹In a flat fading case, MMSE array processing achieves exactly the "matched
filter" bound performance.

A slightly different layered space-time approach based on
space-time coding [23], [24] has been studied in [25]. Although
it is difficult to make a general comparison, we will see later that
our coded layered space-time approach does by far outperform
the results reported in [25] for N = 4 and N = 8. On the
other hand, for N = 2, space-time coded quaternary phase-shift
keying (QPSK) without layered processing appears to be the
best known technique for achieving a spectral efficiency of 2
bps/Hz.

This paper is organized as follows. Section II provides a 
brief review of Foschini's layered space-time concept. Section 
III describes the two coded layered space-time architectures 
and presents a capacity analysis which reveals the equivalence 
of a generalized Foschini's lower bound formula and the true 
capacity bound. Section IV provides details on the array pro- 
cessing, equalization, and iterative MAP techniques. Section V 
presents performance results. A summary and conclusions are 
given in Section VI. 

II. Background Theory 

We briefly review the theory behind Foschini's layered
space-time concept. The generalized Shannon capacity for a
MIMO Rayleigh fading system with N transmit and M receive
antennas is given in [1] as

$$C = \log_2\left[\det\left(I + \frac{\rho}{N} H H^{\dagger}\right)\right] \tag{1}$$

where $H$ is an M × N matrix, the (i, j)th element of which
is the normalized channel transfer function of the transmission
link between the jth transmit antenna and the ith receive antenna,
$I$ is the M × M identity matrix, $\rho$ is the average SNR
per receive antenna, and det( ) and superscript $\dagger$ denote determinant
and conjugate transpose. It is assumed that the transmit
power is equally divided among the N transmit antennas. The
normalization of the channel transfer function is done such that
the average (over Rayleigh fading) of its squared magnitude is
equal to unity.
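
As a numerical illustration of the capacity expression, the following sketch evaluates log2 det(I + (ρ/N) H H†) for a given channel matrix using only the Python standard library. The function names and the small Gaussian-elimination determinant are illustrative assumptions, not part of the source; ρ is taken as a linear (not dB) SNR.

```python
import math


def determinant(matrix):
    """Determinant of a square (complex) matrix via elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0 + 0.0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0.0:
            return 0.0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def mimo_capacity(h, rho):
    """C = log2 det(I + (rho/N) H H^dagger); h is M rows of N channel gains."""
    m, n = len(h), len(h[0])
    # Gram matrix H H^dagger (M x M).
    gram = [[sum(h[i][k] * h[j][k].conjugate() for k in range(n)) for j in range(m)]
            for i in range(m)]
    inner = [[(1.0 if i == j else 0.0) + (rho / n) * gram[i][j] for j in range(m)]
             for i in range(m)]
    return math.log2(determinant(inner).real)
```

For example, an identity channel with N = M = 2 and ρ = 2 gives det(2I) = 4, i.e. 2 bits/s/Hz, which is two parallel unit-gain channels each operating at an SNR of ρ/N = 1.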

The lower bound on capacity is provided in [1] as

$$C > \sum_{k=M-N+1}^{M} \log_2\left[1 + \frac{\rho}{N}\chi^2_{2k}\right] = C_F \tag{2}$$

where $\chi^2_{2k}$ is a chi-squared random variable with 2k degrees of
freedom. For M = N,

$$C_F = \sum_{k=1}^{N} \log_2\left[1 + \frac{\rho}{N}\chi^2_{2k}\right]. \tag{3}$$

Since $\chi^2_{2k}$ represents a fading channel with a diversity order
of k, the lower-bound capacity in (3) can be viewed as the sum
of the capacities of N independent channels with increasing diversity
orders from 1 to N. This suggests a layered space-time
approach [7] for detecting the N transmitted signals as follows:
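
The view of the lower bound as a sum over N independent channels of increasing diversity order can be checked by simulation. The sketch below is illustrative only and rests on stated assumptions: M = N, each layer contributes log2(1 + (ρ/N) χ), and the chi-squared variable with 2k degrees of freedom is normalized as a sum of k unit-mean exponential variables (consistent with unit average squared channel magnitude), which is a Gamma variable with shape k and scale 1. The function names are hypothetical.

```python
import math
import random


def lower_bound_from_chi(chi_values, rho):
    """Sum of log2(1 + (rho/N) * chi_k) over the N layers, with N = len(chi_values)."""
    n = len(chi_values)
    return sum(math.log2(1.0 + (rho / n) * chi) for chi in chi_values)


def sample_lower_bound(n, rho, rng):
    """One random draw of the bound for M = N under the assumed normalization."""
    # Sum of k unit-mean exponentials == Gamma(shape=k, scale=1): diversity order k.
    chis = [rng.gammavariate(k, 1.0) for k in range(1, n + 1)]
    return lower_bound_from_chi(chis, rho)


def average_lower_bound(n, rho, trials=2000, seed=1):
    """Monte Carlo average of the bound over independent fading realizations."""
    rng = random.Random(seed)
    return sum(sample_lower_bound(n, rho, rng) for _ in range(trials)) / trials
```

Calling `average_lower_bound(n, rho)` for increasing n at a fixed ρ gives a rough picture of how the bound grows with the number of antennas.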



In the first layer, the receiver detects a first transmitted signal
by nulling out interference from N - 1 other transmitted signals
through array processing. Assuming a "zero forcing" (ZF) constraint,
one receive antenna is needed to completely correlate
and subtract each interference [5]. Thus, the overall process of
nulling N - 1 interferences leaves the receiver with N - (N - 1) = 1
degree of freedom to provide diversity for detecting the
first signal, i.e., a diversity order of 1 (or simply no diversity).
Once detected, the first signal is subtracted out from the received
signals on all N antennas.

In the second layer, the receiver performs similar interference
nulling to detect a second transmitted signal. This time, since
there are only N - 2 remaining interferences, the receiver affords
a diversity order of 2. The detected signal is again subtracted
out from the received signals provided by the first layer.

Repeating the above interference nulling/canceling step
through N layers, we see that the receiver affords an increasing
order of diversity from 1 to N. If the capacities achieved in
individual layers can be combined in some manner, then the
layered space-time approach just mentioned will achieve the
capacity lower bound expressed in (3). We will explore two
capacity combining possibilities in the next section.
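
The nulling/canceling steps above can be made concrete for the smallest case, N = 2, with flat fading and zero forcing. This is a minimal sketch under those assumptions (no noise model, no layer ordering, no equalization); the function name and argument layout are hypothetical.

```python
def zf_layered_detect_2x2(h, r, constellation):
    """Detect two symbols from two receive samples by nulling, then cancelling.

    h[i][j] is the gain from transmit antenna j to receive antenna i;
    r holds the two received samples.
    """
    def slicer(z):
        return min(constellation, key=lambda s: abs(z - s))

    h1 = (h[0][0], h[1][0])  # signature of signal 1 on the two antennas
    h2 = (h[0][1], h[1][1])  # signature of signal 2

    # Layer 1: weights orthogonal to h2 null signal 2 (diversity order 1).
    w = (h2[1], -h2[0])
    z1 = (w[0] * r[0] + w[1] * r[1]) / (w[0] * h1[0] + w[1] * h1[1])
    x1 = slicer(z1)

    # Cancel the detected signal from both antennas.
    residual = (r[0] - h1[0] * x1, r[1] - h1[1] * x1)

    # Layer 2: no interference left, so combine both antennas (diversity order 2).
    energy = abs(h2[0]) ** 2 + abs(h2[1]) ** 2
    z2 = (h2[0].conjugate() * residual[0] + h2[1].conjugate() * residual[1]) / energy
    x2 = slicer(z2)
    return x1, x2
```

The first layer spends its single spare degree of freedom on nulling, while the second layer, after cancellation, uses both antennas for diversity combining; a wrong decision in the first layer would propagate into the second, which is the error mechanism the turbo processing is meant to mitigate.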

Note that the capacity and capacity lower bound given in
(1)-(3) are actually frequency-dependent. We here provide an
explicit capacity formula for band-limited, frequency-selective
channels (some variables are redefined to be consistent with
later analytical development).

$$C = \int \log_2\left[\det\left(I + \mathcal{R}\right)\right] df \tag{4}$$

where, as shown in equations (5)-(8) at the bottom of the page,
$\mathcal{R}$ is the frequency-domain correlation matrix of the signals
on M receive antennas, $\mathcal{N}_j(f)$ is the noise power density at
frequency f on the jth receive antenna, T is the symbol period,
$H_{ij}(f)$ is the channel transfer function (not normalized) of the
transmission link between the ith transmit antenna and the
jth receive antenna, and superscripts * and T denote complex
conjugate and transpose. Note in (7) and (8) that we consider
the folded spectra $H_{ij}(f - (m/T))$ and $\mathcal{N}_j(f - (m/T))$ of
the channel transfer function and noise power density, where
m = -J, ..., J (J is finite because the signal sources are
assumed to be band-limited). This is to take into account the
effect of excess bandwidth and symbol-rate sampling when
the frequency selectivity of the channel is not symmetrical
around the Nyquist band edges. Even though we assume white
Gaussian noise, the noise power density near and outside the
Nyquist band edges actually attenuates with the receive filter
transfer function. From our experiment (assuming a square-root
Nyquist filter with a 50% rolloff factor), the computed capacity
can be underestimated by as much as 0.5 dB if this attenuation
is not taken into account.

III. Coded Layered Space-Time Architectures

A. Basic Concepts 

We consider two coded layered space-time approaches as
shown in Fig. 1(a) and (b). In the first approach, named "LST-I"
(LST stands for "layered space-time"), the coded information
bits are interleaved across the N parallel data streams $x_1$,
$x_2$, ..., $x_N$, where $x_i$ denotes a sequence of complex-valued,
transmit data symbols (e.g., 8-PSK symbols). The receiver first
decouples the N data streams through interference nulling/cancellation,
as described in Section II, then deinterleaves and
decodes all the N data streams as one information block. In
the second approach, "LST-II," the information is first divided
into N uncoded bit sequences $u_1$, $u_2$, ..., $u_N$, each of which
is independently encoded, interleaved, and symbol-mapped to
generate one of the N parallel data streams. At the receiver, the
N data streams are decoupled and independently deinterleaved
and decoded. The output of LST-II produces N information
blocks at a rate of 1/N times the output rate of LST-I.

Fig. 1. Coded layered space-time architecture: (a) LST-I and (b) LST-II.

In Fig. 1(a) and (b), "space-time equalizer" refers to a combined
array processing (for interference nulling) and equalization
function. Instead of the ZF criterion, we assume that the
optimization of the antenna/equalizer weights is based on a MMSE
criterion, which in general provides better performance than a
ZF approach. Foschini [7] has also indicated a potential
performance benefit of using MMSE (or "maximum SNR") rather
than ZF in a layered space-time architecture. Although we show
M receive antennas in Fig. 1(a) and (b) (M ≥ N is the sufficient
condition for nulling N - 1 interferences), we only consider
M = N in this study.

Similar to [9], the underlying assumption of our layered
space-time architecture is that the receiver can order the detections
of N data streams such that an undetected layer always
has the strongest received SNR. In LST-I, the space-time
equalizer in each layer must provide data decisions for $x_{\Lambda(i)}$ ($\Lambda$ denotes
the permutation due to layer ordering) to the interference
canceller, since decoding cannot be done until all the layers
are processed. In LST-II, the interference cancellation in each
layer can use more reliable data decisions for $x_{\Lambda(i)}$ provided by
the decoder. Thus, LST-I is more prone to decision errors than
LST-II. In order to minimize the effects of decision errors, and
also to improve the joint detection/decoding performance in
general, we assume the use of turbo processing in our layered
space-time architecture. As shown in Fig. 1(a) and (b), the
space-time equalizers and the decoders provide extrinsic soft
information to one another by subtracting the received soft
information from the newly computed soft information. Details
on MMSE space-time equalization and turbo processing will
be provided in Section IV.
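
The exchange of extrinsic soft information can be sketched as follows. This is an illustration only: soft information is assumed to be represented as per-bit log-likelihood ratios (LLRs), `equalizer` and `decoder` are hypothetical callables standing in for the soft-in/soft-out stages, and the interleaving/deinterleaving between the two stages is omitted for brevity.

```python
def extrinsic_llrs(posterior_llrs, prior_llrs):
    """Extrinsic information: newly computed soft output minus the soft input received."""
    return [post - prior for post, prior in zip(posterior_llrs, prior_llrs)]


def turbo_exchange(equalizer, decoder, n_bits, iterations):
    """Alternate two soft-in/soft-out stages, passing only extrinsic LLRs between them.

    `equalizer` and `decoder` are hypothetical callables mapping a list of
    a priori LLRs to a list of a posteriori LLRs.
    """
    to_equalizer = [0.0] * n_bits           # no prior knowledge on the first pass
    decoder_post = list(to_equalizer)
    for _ in range(iterations):
        eq_post = equalizer(to_equalizer)
        to_decoder = extrinsic_llrs(eq_post, to_equalizer)
        decoder_post = decoder(to_decoder)
        to_equalizer = extrinsic_llrs(decoder_post, to_decoder)
    return decoder_post
```

Subtracting the received soft input before passing information on keeps each stage from being fed back its own earlier output as if it were new evidence.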

B. Capacity Analysis 

Without getting into the detail of all the processing functions,
we first discuss the general differences between the two coded
layered space-time approaches. In particular, we are most interested
in the capacity combining aspects of the two approaches.

Let $\mathrm{SNR}_k$ denote the output SNR of the array processing in
the kth layer. First, we note that, in LST-II, the capacity of each
processing layer is bounded by the spectral efficiency R of the
modulation and coding in each layer, e.g., R = 1 for 8-PSK
with rate-1/3 coding. Thus, the total capacity of LST-II is given
by (similar to (1)-(3), we write capacity without showing the
frequency dependence)

$$C_{\mathrm{LST\text{-}II}} = \sum_{k=1}^{N} \min\{R, \log_2(1 + \mathrm{SNR}_k)\}. \tag{9}$$

Without layer ordering, it is most likely that the overall performance
of LST-II will be largely influenced by the error probability
of the first processing layer with a diversity order of only
1. In contrast, our simulation results in Section V will show that
LST-II with layer ordering can actually achieve a diversity order
of approximately N.

Since coding is performed across all the processing layers in
LST-I, the achieved SNR in each layer will contribute to the
overall layer processing performance. As Foschini [7] indicated,
such a coding scheme should be able to achieve the capacity
lower bound in (3). Here, we provide a generalized formula of
Foschini's lower bound by removing the ZF constraint and instead
using $\mathrm{SNR}_k$ as the generalized output SNR.

$$C_F = \sum_{k=1}^{N} \log_2(1 + \mathrm{SNR}_k). \tag{10}$$
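
The difference between the two capacity-combining rules, the per-layer cap of LST-II and the plain sum of LST-I, can be seen with a short numerical sketch. The function names are hypothetical, and the per-layer SNR values are assumed to be linear ratios rather than dB.

```python
import math


def capacity_lst_ii(layer_snrs, rate):
    """LST-II: each independently coded layer is capped at `rate` bits/s/Hz."""
    return sum(min(rate, math.log2(1.0 + snr)) for snr in layer_snrs)


def capacity_lst_i(layer_snrs):
    """LST-I: coding across layers, so the per-layer capacities simply add."""
    return sum(math.log2(1.0 + snr) for snr in layer_snrs)
```

For layer SNRs of 1, 3 and 7 (per-layer capacities of 1, 2 and 3 bits/s/Hz), the LST-I sum is 6 bits/s/Hz, while LST-II with R = 1 (8-PSK with rate-1/3 coding) is capped at 3 bits/s/Hz, because the stronger layers cannot contribute more than R each.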



Reference [6] provides output SNR formulas for different
types of optimum space-time processors. Here, it is of great
interest to express the capacity lower bound using the best
performance achievable. In the following equation, we represent
$\mathrm{SNR}_k$ in (10) by the "matched filter" bound, the maximum
achievable SNR by any space-time processing receiver:

$$C_F^{\mathrm{MF}} = \int \sum_{k=1}^{N} \log_2\left(1 + \Gamma_k(f)\right) df \tag{11}$$

where $\Gamma_k(f)$ is the "matched filter" bound² given by equation
(15) in Section IV-A (simply a rewriting of the result in [6]).

²Note that the "matched filter" bound usually refers to the integrated SNR
($\Gamma_k(f)$) over the signal bandwidth (e.g., [6]). However, in the capacity context,
we assume the best possible way to exploit the SNR's in all frequency components.

Fig. 2. Space-time DDFSE with MAP processing.

Note that (11) is an explicit formula similar to (4); it shows the
frequency dependence of the output SNR and the integration of
capacity over the signal bandwidth. Also, we assume that the
kth layer has k - 1 interferences.

In the process of analyzing the meaning of (1 1), we discov- 
ered an identical relationship between ( 1 1 ) and (4) regardless of 
how the layers are ordered. We show the proof in the Appendix 
(this proof is valid even when A/ .V). Thus. Foschini' s lower 
bound (3) is actually the true Shannon capacity bound when 
the output SNR of the space-time processing in each layer is 
represented by the corresponding 'matched filter' bound. This 
proves the optimality of the layered space-time concept. 
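
The identity behind this equivalence, det(R_N K^{-1}) equal to the product over layers of (1 + Γ_k), can be spot-checked numerically at a single frequency. The sketch below is only an illustration under stated assumptions: the noise covariance K is taken as a scaled identity, the channel vectors are random complex Gaussian draws, R_k is built up as R_{k-1} + H_k^* H_k^T starting from R_0 = K, and Γ_k = H_k^T R_{k-1}^{-1} H_k^*. All function names are hypothetical.

```python
import random


def det(matrix):
    """Determinant of a small square complex matrix (Gaussian elimination, partial pivoting)."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) == 0.0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return result


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Cramer's rule (adequate for the tiny sizes used here)."""
    n = len(matrix)
    d = det(matrix)
    return [
        det([[rhs[r] if c == j else matrix[r][c] for c in range(n)] for r in range(n)]) / d
        for j in range(n)
    ]


def check_identity(num_rx=3, num_signals=4, noise_power=0.5, seed=1):
    """Return (det(R_N K^-1), product of (1 + Gamma_k)) for random channel vectors."""
    rng = random.Random(seed)
    # Assumption: K = noise_power * I (spatially white noise), so R_0 = K.
    r_mat = [[complex(noise_power if i == j else 0.0) for j in range(num_rx)]
             for i in range(num_rx)]
    product = 1.0
    for _ in range(num_signals):
        h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_rx)]
        h_conj = [x.conjugate() for x in h]
        whitened = solve(r_mat, h_conj)                        # R_{k-1}^{-1} H_k^*
        gamma = sum(a * b for a, b in zip(h, whitened)).real   # H_k^T R_{k-1}^{-1} H_k^*
        product *= 1.0 + gamma
        for i in range(num_rx):                                # R_k = R_{k-1} + H_k^* H_k^T
            for j in range(num_rx):
                r_mat[i][j] += h_conj[i] * h[j]
    return (det(r_mat) / noise_power ** num_rx).real, product


if __name__ == "__main__":
    print(check_identity())
```

The defaults use three receive branches and four signals, so the two returned numbers agree (up to rounding) even when the number of signals differs from the number of receive branches.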

The capacity analysis presented above is based on the assumption
of perfect layer detection, i.e., no decision errors affecting
the detection in subsequent layers. In reality, LST-I is
more prone to decision errors than LST-II and layer ordering
becomes important for both schemes. Our simulation results in
Section V will demonstrate how decision errors affect the actual
performance of the two coded layered space-time approaches.

IV. Signal Processing Functions 
A. Space-Time Equalization 

We consider combined array processing and equalization in
order to cope with dispersive channels. A space-time equalizer,
consisting of a spatial/temporal whitening filter followed by
a decision-feedback equalizer (DFE) or maximum-likelihood
sequence estimator, can suppress both ISI and dispersive
interference [6]. The space-time equalizer used in this study
is shown in Fig. 2. It consists of a linear feedforward filter
W_i(f), i = 1, ..., M, on each diversity branch, a combiner,
symbol-rate sampler, soft-input, soft-output (SISO) MAP
sequence estimator, and synchronous linear feedback filter
B'(f). The feedforward filters {W_i(f)} are shown as
continuous-time filters, but they can be implemented in practice
using fractionally-spaced tapped delay lines. The combined
use of a sequence estimator and feedback filter after diversity
combining is similar to the structure of a delayed decision-feedback
sequence estimator (DDFSE) [27]. Thus, we refer to the
space-time equalizer in Fig. 2 as a "space-time DDFSE." A
"space-time DFE" is a structure where the sequence estimator
is replaced by a memoryless hard slicer.

It has been shown in [20] that a space-time DDFSE with a
sequence estimator memory of μ and a feedback filter of length
L_B - μ can be optimized in an MMSE manner as if it were a
space-time DFE with a feedback filter of length L_B. In fact,
numerical results in [6] showed that an optimum space-time
DFE (with unconstrained filter lengths and no feedback decision
errors) can perform within only 1-2 dB of the ideal "matched
filter" bound performance. Thus, in order to have a practical
receiver structure for layered space-time processing, we consider
a space-time DDFSE with a minimum sequence estimator
memory, i.e., μ = 1. The sequence estimator is used only to
provide a trellis structure needed for turbo processing, and
presumably more reliable feedback decisions than the slicer used
in a space-time DFE. Details on MAP processing will be given
in Section IV-B.

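We first provide a brief review of the space-time filtering
theory. Based on the space-time DFE equivalent model described
above, the MMSE solution for the feedforward filters
{W_i(f)} of unconstrained length can be given using the
results of [6] (see also [28]):

$$W(f) = \frac{1 + B(f)}{1 + \Gamma_k(f)}\, R_{k-1}^{-1} H_k^{*} \qquad (12)$$

where

$$R_k = \sum_{j=1}^{k} H_j^{*} H_j^{T} + K \qquad (13)$$

[Equation (14) is not recoverable from the source text.]

$$\Gamma_k(f) = H_k^{T} R_{k-1}^{-1} H_k^{*} \qquad (15)$$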

In the above equations, we assume that there are a total of k
signal sources and we use H_k to indicate the channel vector
[see (8)] of the desired signal. The remaining k - 1 signals
are interferences. B(f) in (12) is the feedback filter of the
space-time DFE, which, from our assumption of μ = 1, is
only one tap longer than B'(f). Γ_k(f) is the
signal-to-interference-plus-noise power density ratio at frequency f, i.e.,
the "matched filter" bound. B(f) can be determined through
spectral factorization of 1 + Γ_k(f).

Equation (12) indicates that the optimum feedforward filter
consists of a spatial/temporal filter R_{k-1}^{-1} H_k^*, which performs
prewhitening (R_{k-1}^{-1} is the whitening filter of interference and
noise) and matching to the desired channel, followed by a temporal
filter (1 + B(f))/(1 + Γ_k(f)), which is an anticausal
post-whitening filter for suppressing precursor ISI.

A filter length analysis of the optimum space-time DFE described
above is provided in [6]. We will first consider a
finite-length realization of the space-time DFE based on the results
presented there. We assume that the system has similar
ISI situations as in EDGE and GSM. Namely, using the multipath
delay profiles specified for EDGE and GSM (see Tables I
and II), and assuming the same symbol rate of 270.833
kbaud (T = 3.692 μs) with Nyquist filtering (partial response
signaling is used in EDGE and GSM), the ISI lasts up to five
symbol periods for the hilly terrain (HT) profile in Table II.

TABLE I
GSM Typical Urban (TU) Channel Model

Path Delay (μs)    0.0    0.2    0.5    1.6    2.3    5.0
Path Power (dB)   -3.0    0.0   -2.0   -6.0   -8.0  -10.0

TABLE II
GSM Hilly Terrain (HT) Channel Model

Path Delay (μs)    0.0    0.2    0.4    0.6   15.0   17.2
Path Power (dB)    0.0   -2.0   -4.0   -7.0   -6.0  -12.0

According to the empirical filter length formulas in [6], the
feedforward filter on each branch should have the following causal
and anticausal lengths to achieve near-optimum performance:

$$L_C = K(N-1)\left(\rho_{dB}/10\right), \qquad L_A = K + KN\left(\rho_{dB}/10\right) \qquad (16)$$

where K is the channel memory, N is used here to indicate
the total number of signals, including the desired and interference
signals, and ρ_dB is the average SNR in decibels. In our
case, K = 5. Assume, for example, that the system has four
transmit and four receive antennas (N = 4) and the operating
range of average SNR is around 5 dB (ρ_dB = 5). The required
filter length, including the center tap, will be L_C + L_A + 1 =
23.5. This is a highly impractical number, considering that four
such filters are required, one for each receive antenna.
Furthermore, as mentioned earlier, the optimum feedforward filters
should be implemented using fractionally-spaced tapped delay
lines. If T/2-spaced filters are used, the total number of taps
will be doubled. Such a space-time system with about 200
coefficients would be nearly impossible to compute in any radio
link design.
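
The arithmetic behind these figures can be reproduced with a few lines. The sketch below assumes that the empirical formulas read L_C = K(N - 1)(ρ_dB/10) and L_A = K + KN(ρ_dB/10), a reading that is consistent with the quoted span of 23.5; the function and parameter names are hypothetical.

```python
def feedforward_lengths(channel_memory, num_signals, snr_db):
    """Causal and anticausal feedforward lengths (in symbol periods) under the assumed formulas."""
    causal = channel_memory * (num_signals - 1) * snr_db / 10.0
    anticausal = channel_memory + channel_memory * num_signals * snr_db / 10.0
    return causal, anticausal


def total_coefficients(channel_memory, num_signals, snr_db, num_branches, taps_per_symbol=2):
    """Total feedforward coefficients over all branches, including the center tap."""
    causal, anticausal = feedforward_lengths(channel_memory, num_signals, snr_db)
    span = causal + anticausal + 1
    return span * num_branches * taps_per_symbol


if __name__ == "__main__":
    print(feedforward_lengths(5, 4, 5.0))      # K = 5, N = 4, 5 dB average SNR
    print(total_coefficients(5, 4, 5.0, 4))    # four branches, T/2-spaced taps
```

With K = 5, N = 4 and 5 dB, the span is 7.5 + 15 + 1 = 23.5 symbol periods per branch, and four T/2-spaced branches give 188 coefficients, in line with the "about 200" stated in the text.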

Faced with such impracticality of an ideal signal processing
arrangement, we proceed to consider a suboptimum option.
First, we will use symbol-spaced instead of fractionally-spaced
feedforward filters. In order to avoid significant performance
penalties, a channel estimation-based timing recovery algorithm
described in [29] will be used to optimize the symbol
timing and the decision delay of the center tap relative to the
measured channel impulse response. In principle, such timing
optimization also allows the DFE to use a feedforward filter
with a shorter span than the channel memory while achieving
a reasonable performance [29], [30]. After experimenting with
a number of significantly reduced filter length options, we
decided on the following suboptimum space-time equalizer
structure. The feedforward filter on each branch has a total
of nine symbol-spaced taps, which are positioned such that
L_C = L_A = 4. The feedback filter has a length of 8, i.e.,
L_B = 9 with the MAP processor memory μ = 1 included
(in order to completely cancel postcursor ISI, L_B must be at
least as large as the channel memory plus the number of causal
taps in the feedforward filter). The method in [29] is used to
optimize the symbol timing and the decision delay of the center
tap as described above. Direct matrix inversion is used to set
all the filter coefficients in a standard MMSE linear processing
fashion [4], [26], [31], assuming perfect channel estimation.

B. Turbo Processing 

The turbo processing technique used in this study is also
based on a standard approach; the reader is referred to the
rich literature [11]-[19], [21]-[22], [32]-[35] for a thorough
treatment of this subject. The space-time equalizer and the
decoder both perform SISO sequence estimation to compute
the a posteriori probability (APP) of the transmit data symbols.
This sequence estimation is done using the
Bahl-Cocke-Jelinek-Raviv (BCJR) forward/backward algorithm. In the
following, we describe the basic principle of the iterative
detection/decoding process.

Using the BCJR algorithm, the MAP processor in the
space-time equalizer with m^μ states (m is the signal constellation
size, e.g., m = 8 for 8-PSK, and μ = 1 in our case)
computes the APP P[c_k | y] of the kth coded bit c_k based on
the observation y, where y is the equalizer output sequence
corresponding to all the data symbols in a received block (see
Fig. 2), and the a priori information provided by the decoder
(this is not available in the first "turbo" iteration). The logarithm
Λ(c_k) = log_e(P[c_k | y]) of this APP can be regarded as the sum
of two terms

$$\Lambda(c_k) = \lambda^{a}(c_k) + \lambda^{e}(c_k) \qquad (17)$$

where λ^a(c_k) = log_e(P[c_k]) is the logarithm of the a priori
information provided by the decoder, and λ^e(c_k) is called the
"extrinsic" information. In each "turbo" iteration, the space-time
equalizer subtracts λ^a(c_k) from the newly computed value of
Λ(c_k) to obtain the extrinsic information λ^e(c_k) [see Fig. 1(a)
and (b)]. The entire sequence {λ^e(c_k)} is deinterleaved and
forwarded to the decoder.

Similarly, the decoder computes the log-APP
Υ(c_k) = log_e(P[c_k | {λ^e(c_k)}]) based on the deinterleaved
extrinsic information provided by the space-time equalizer,
and subtracts λ^e(c_k) from it to obtain the extrinsic information
υ^e(c_k). The extrinsic information is then interleaved and
forwarded to the equalizer as the new a priori information
λ^a(c_k) for the next "turbo" iteration.
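
The exchange described above can be sketched as a short loop. This is only an outline of the bookkeeping, not of the BCJR computations themselves: `equalizer_map` and `decoder_map` are hypothetical callables standing in for the two SISO processors, each returning one log-APP value per coded bit, and the zero-valued initial prior is a placeholder for "no a priori information" in the first iteration.

```python
def interleave(values, perm):
    """Reorder values so that output position i takes input position perm[i]."""
    return [values[p] for p in perm]


def deinterleave(values, perm):
    """Inverse of interleave for the same permutation."""
    out = [None] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def turbo_exchange(equalizer_map, decoder_map, perm, iterations):
    """Pass extrinsic information back and forth; returns the last a priori sequence."""
    prior = [0.0] * len(perm)   # placeholder: nothing known before the first iteration
    for _ in range(iterations):
        eq_app = equalizer_map(prior)
        eq_ext = [a - p for a, p in zip(eq_app, prior)]      # subtract the a priori input
        dec_in = deinterleave(eq_ext, perm)
        dec_app = decoder_map(dec_in)
        dec_ext = [a - e for a, e in zip(dec_app, dec_in)]   # subtract the decoder's own input
        prior = interleave(dec_ext, perm)
    return prior


if __name__ == "__main__":
    # Toy stand-ins for the two SISO processors (illustration only).
    toy_equalizer = lambda prior: [p + i for i, p in enumerate(prior)]
    toy_decoder = lambda soft_in: [2.0 * v for v in soft_in]
    print(turbo_exchange(toy_equalizer, toy_decoder, [2, 0, 1], 1))
```

The point of each subtraction is that a processor only passes on what it has newly learned, so the other processor does not receive its own earlier output back as if it were independent evidence.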

The interleaver considered in this study is a pseudo-random
interleaver, i.e., we generate a pseudo-random permutation of
numbers from 1 to l, where l is the block length, and then use
this permutation as a fixed interleaver.
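
A minimal sketch of such a fixed pseudo-random interleaver follows. The seed value, the zero-based indexing, and the function names are assumptions made for illustration; the text does not specify how the permutation is generated.

```python
import random


def make_interleaver(block_length, seed=0):
    """Generate one pseudo-random permutation of 0..block_length-1 and keep it fixed."""
    perm = list(range(block_length))
    random.Random(seed).shuffle(perm)
    return perm


def interleave(bits, perm):
    """Output position i carries the input bit at position perm[i]."""
    return [bits[p] for p in perm]


def deinterleave(bits, perm):
    """Undo interleave for the same permutation."""
    out = [None] * len(bits)
    for i, p in enumerate(perm):
        out[p] = bits[i]
    return out


if __name__ == "__main__":
    perm = make_interleaver(12, seed=3)
    block = list(range(12))
    print(deinterleave(interleave(block, perm), perm) == block)
```

Seeding a private generator means the transmitter and receiver can regenerate the same permutation independently, which is what makes the interleaver "fixed".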

In combining the branch metric obtained from the equalizer
output with the branch metric obtained from the soft input
provided by the decoder, the MAP processor in the space-time
equalizer must compute the a priori information for each 8-PSK
symbol x_k from the three soft inputs (λ^a(c_{3k}), λ^a(c_{3k+1}),
λ^a(c_{3k+2})). We assume that this is done by way of summing
the three soft inputs as if the three coded bits were transmitted
from independent sources (these soft inputs are actually not
independent when conditioned on the observed waveform of the
entire data burst). This is a suboptimal method, which is known
to cause a "random modulation" performance degradation
effect in bit-interleaved coded modulation [36]-[38]. However,
this effect can be overcome by iterative decoding [38], which is
implicit in our turbo space-time processing approach.
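
The summing step can be illustrated as follows. This sketch assumes that each soft input is available as a pair of log-probabilities (for bit value 0 and bit value 1) and identifies each 8-PSK symbol only by its three-bit label; the mapping of labels to constellation points (Gray mapping in the text) is left out. The function name is hypothetical.

```python
import math


def symbol_log_priors(bit_log_probs):
    """Log a priori value of each 3-bit label, treating the three bits as independent.

    bit_log_probs holds three (log P[c = 0], log P[c = 1]) pairs, one per coded bit.
    """
    log_priors = []
    for label in range(8):
        bits = ((label >> 2) & 1, (label >> 1) & 1, label & 1)
        log_priors.append(sum(pair[b] for pair, b in zip(bit_log_probs, bits)))
    return log_priors


if __name__ == "__main__":
    probs = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8)]
    logs = [(math.log(p0), math.log(p1)) for p0, p1 in probs]
    print([round(math.exp(v), 4) for v in symbol_log_priors(logs)])
```

Adding log-probabilities is the same as multiplying probabilities, which is exactly the independence approximation the text describes as suboptimal.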

As noted earlier, in LST-I, the space-time equalizer in each
layer must provide immediate data decisions to be used for
interference cancellation. Since these decisions are not "protected"
by coding, they are prone to errors. In this study, we explore
a soft decision technique to minimize the effect of decision
errors. The optimum soft decision can be computed by averaging
all the possible transmit symbols weighted by their APPs [39]

$$\hat{x}_k = \sum_{x \in \mathcal{X}} x\, P[x_k = x \mid y] \qquad (18)$$

where 𝒳 includes all the complex-valued 8-PSK constellation
points. Since P[x_k = x | y] can be obtained along with the
computation of the APP P[c_k | y], this soft decision approach can be
implemented with nearly no additional cost in complexity.
Similarly, we apply the same technique to compute soft decision
outputs in LST-II.
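
The APP-weighted average can be written in a few lines. The sketch below assumes unit-energy 8-PSK points at multiples of 45 degrees and takes the symbol APPs as a list in the same order as the points; both choices, and the function names, are illustrative assumptions.

```python
import cmath
import math


def psk8_points():
    """Unit-energy 8-PSK constellation points (assumed phase offset of zero)."""
    return [cmath.exp(2j * math.pi * i / 8) for i in range(8)]


def soft_decision(symbol_app):
    """Average of the constellation points weighted by their a posteriori probabilities."""
    return sum(x * p for x, p in zip(psk8_points(), symbol_app))


if __name__ == "__main__":
    confident = [1.0, 0, 0, 0, 0, 0, 0, 0]
    undecided = [1 / 8] * 8
    print(abs(soft_decision(confident)), abs(soft_decision(undecided)))
```

A confident APP reproduces the hard decision, while an undecided APP shrinks the estimate toward zero, so an unreliable symbol contributes little to the interference that is subtracted in later layers.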

V. Performance Results
A. Performance Criteria and System Assumptions 

We now present performance results of the layered
space-time concepts described so far. The performance measure
is the block-error rate (BLER) over Rayleigh fading.
The results are obtained through Monte Carlo simulation. The
BLER is averaged over up to 40000 channel realizations. Each
block contains 400 information bits (before coding).

In comparing the performance results to channel capacity,
we follow the convention of a number of previous works (e.g.,
[9], [23]) to compare the computed BLER with the "outage
capacity" [1], i.e., the probability that a specified bit rate is not
supported by the channel capacity. This is a vague comparison,
since the Shannon limit refers to the highest error-free bit rate 
possible for long encoded blocks but it does not specify how 
long the blocks should be. Nevertheless, such a comparison 
should still be meaningful as long as the block length and BLER 
are specified. This is similar to the way a bit-error rate of 10"*^ 
is commonly used as the "error free" reference for an additive 
white Gaussian noise (AWGN) channel. 

In order to assess the best performance achievable, we assume
that the channel characteristics can be perfectly estimated
at the receiver. Similarly, the choices of 1-D processing and
coding techniques are important to deliver the best possible
performance. We try to optimize these choices while keeping
them as practical as possible. Except for the use of array
processing and iterative MAP algorithms, all the radio link
techniques assumed in this study are "legacy-compatible" with
the EDGE standard (note also that vast research interest in turbo
coding has made simplified MAP algorithms available [21],
[22] that are not much more complex than the conventional
Viterbi algorithm). None of these techniques are claimed
to be optimum. Yet, our results indicate that near-capacity
performance is achievable when combining them through the
coded layered space-time architectures.




Fig. 3. Performance of bit-interleaved 8-PSK with rate-1/3 convolutional and
turbo coding over (a) AWGN channel, and (b) quasi-static flat Rayleigh fading
channels with N receive diversity antennas.

B. Choosing the Coding Scheme

We consider a bit-interleaved coded modulation scheme
using 8-PSK with Gray mapping and rate-1/3 coding.
Square-root Nyquist filtering with 30% rolloff is assumed at the
transmitter and receiver. Bit-interleaved coded modulation has
been shown [36], [37] to outperform traditional trellis-coded
modulation in fast fading channels (where time diversity can
be exploited through sufficient interleaving) and it can be
improved upon by considering a better mapping technique
that permits a large Euclidean distance without sacrificing the
maximum Hamming distance of the baseline coding scheme
[38]. In this paper, though, since quasi-static fading is our
basic assumption, the code by itself must be able to withstand
deep fades. In principle, any code that performs well in an
AWGN channel is considered a good candidate; turbo codes
are among the strongest candidates that come to mind.

Fig. 3(a) and (b) provide a performance comparison between
two rate-1/3 coding schemes: one using a 64-state convolutional
code with (octal) generators (G1, G2, G3) = (155, 117, 123)
(the same code as proposed for EDGE [20]) and the other using
a turbo code with two identical 16-state recursive encoders similar
to the scheme originally proposed by Berrou and Glavieux
[12] (the results here assume generators (G1, G2) = (23, 31),
which appeared to perform slightly better than other generators
we tested). Both schemes assume the use of bit-interleaved
8-PSK with Gray mapping. The turbo coding scheme has an
additional interleaver within the encoder, which uses another
pseudo-randomly generated permutation. The receiver structure
is consistent with what we have described so far. Note, however,
that we assume a minimum number of filter taps (only one
feedforward tap per branch and no feedback filter) whenever there
is no delay spread assumed, although the MAP processor in the
DDFSE is always used for iterative detection/decoding as described
in Section IV-B. For the turbo coding scheme, "one iteration"
means a full cycle of three processes: 1) MAP processing
in DDFSE; 2) turbo decoding by the first decoder; and 3) turbo
decoding by the second decoder.
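
For readers who want to see what a rate-1/3, 64-state feedforward convolutional encoder looks like, a minimal sketch is given below using the octal generators (155, 117, 123) quoted above. The bit-ordering convention (the least significant generator bit multiplies the newest input bit) and the zero-tail termination are assumptions made for illustration; the text does not specify either.

```python
def conv_encode(info_bits, generators=(0o155, 0o117, 0o123), constraint_length=7):
    """Rate-1/len(generators) feedforward convolutional encoder with zero-tail termination."""
    mask = (1 << constraint_length) - 1
    register = 0
    coded = []
    # Append constraint_length - 1 zeros so the encoder returns to the all-zero state.
    for bit in list(info_bits) + [0] * (constraint_length - 1):
        register = ((register << 1) | bit) & mask
        for g in generators:
            # Each output bit is the parity of the register taps selected by one generator.
            coded.append(bin(register & g).count("1") % 2)
    return coded


if __name__ == "__main__":
    print(len(conv_encode([0] * 400)))   # coded bits for a 400-bit block, tail included
```

A constraint length of 7 gives 2^6 = 64 states, and a 400-bit block with a 6-bit tail produces 3 x 406 = 1218 coded bits under this termination assumption.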

Fig. 3(a) shows the performance of the two coding schemes in
an AWGN channel. First, we note that the performance of convolutional
coding also benefits from iterative processing. This
is due to the suboptimal nature of the decoding scheme, i.e.,
the "random modulation" effect described earlier, which can be
improved through iterative decoding. Fig. 3(a) shows that most
of the improvement is achieved within two decoding cycles. For 
turbo coding, the performance still improves even after five iter- 
ations, but saturates quickly after ten iterations. At 10'** BLER 
(approximately equivalent to 10"^ bit-error rate), turbo coding 
outperforms convolutional coding by about 2.2 dB, and the required
SNR is within only 2.4 dB of the 0-dB Shannon limit for
a spectral efficiency of 1 bps/Hz (8-PSK with rate-1/3 coding).

However, when we look at the average BLER performance
over quasi-static flat fading channels in Fig. 3(b), the benefit
of turbo coding (with ten iterations) over convolutional coding
(with two iterations) is reduced to only about 0.5 dB at any value
of the average SNR and for all the assumed numbers of receive
diversity antennas. This is not surprising for two reasons. First,
it is well known that the average BLER is determined mostly
by the probability of fading events that result in high BLERs.
If we look at the relative performance at a BLER of, say, above
10% in Fig. 3(a), the difference between the two coding schemes
is indeed less than 1 dB. Second, the performance of convolutional
coding over fading channels is already within about 2 dB
of the capacity bound (the capacity bound in this case is defined
as the probability that the combined output SNR of all
diversity branches is below the 0-dB Shannon limit). Thus, there
is not much room for further improvement.
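
The capacity bound defined in this way has a simple closed form under idealized assumptions. The sketch below assumes independent, identically distributed Rayleigh-fading branches, each with the same average SNR, and a combined output SNR equal to the sum of the branch SNRs; under those assumptions the combined SNR is gamma distributed and the outage probability follows directly. The function and parameter names are hypothetical.

```python
import math


def diversity_outage(avg_snr_db, num_branches, threshold_db=0.0):
    """P(combined SNR < threshold) for ideal combining of i.i.d. Rayleigh branches."""
    # Threshold normalized by the per-branch average SNR (both converted from dB).
    x = 10.0 ** ((threshold_db - avg_snr_db) / 10.0)
    tail = math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(num_branches))
    return 1.0 - tail


if __name__ == "__main__":
    for branches in (1, 2, 4):
        print(branches, diversity_outage(10.0, branches))
```

At a fixed average SNR the outage probability falls quickly as branches are added, which is the diversity effect that leaves little room for a stronger code to improve the average BLER.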

Based on the fact that the performances of the two coding 
schemes are quite similar in quasi-static fading channels, we 
will only consider convolutional coding in the remainder of this 
paper. 

C. Layered Space-Time Performance 

We first look at the performance of the two coded layered
space-time approaches over a flat Rayleigh fading channel.
Fig. 4 shows the different capacity bounds for this channel,
assuming N = 2, 4, and 8, where N is the number of transmit
and receive antennas. Again, although we plot the results as
"block error rate," the capacity bound is defined as the probability
that the specified spectral efficiency R (R = N in this
case) is not supported by each of the differently defined channel
capacities. C denotes the Shannon capacity bound given by (4)
[which is equivalent to the generalized Foschini bound C_F^MF
in (11)], C_F denotes the original Foschini bound (with the ZF
constraint) in (3), and C_LST-II is the capacity bound for LST-II
in (9) (however, the results for C_LST-II are obtained simply
by averaging the probability that R = 1 is not supported by
each processing layer). Note that C_LST-II can indeed provide
approximately a diversity order of N; this is attributed largely
to the use of layer ordering as discussed earlier. Note also that
all the bounds show an improvement with increasing N. This
means that the capacity actually increases more than linearly
with the number of transmit and receive antennas. However,
there is a diminishing improvement as N increases to a much
larger number.

Fig. 4. Capacity bounds for quasi-static flat Rayleigh fading channels with N
transmit and N receive antennas. C: Shannon capacity, C_F: Foschini (original)
bound, and C_LST-II: capacity bound for LST-II.

Fig. 5. Layered space-time performance of LST-I with two transmit and two
receive antennas (N = 2). Quasi-static flat Rayleigh fading channel.

Fig. 5 shows the simulation results for LST-I with 2 transmit
and 2 receive antennas (i.e., N = 2). Three sets of results are
provided, assuming: 1) soft decisions; 2) hard decisions; and 3)
correct decisions in each layer (note that the DDFSE always uses
tentative decisions and provides soft outputs to the decoder).
We see that, although soft decisions offer some improvement
over hard decisions, the impact of decision errors is still quite
noticeable. Fig. 6 shows similar results for N = 4. Here, the
impact of decision errors is not as significant as in the previous
results, and turbo processing and soft decisions help to reduce
much of this impact. With three iterations, the effect of decision
errors almost completely disappears when using soft decisions.
Decision errors have a lesser effect for a larger N because of the
greater diversity order available through array processing and
layer ordering.

In Fig. 7, we compare the results using soft decisions and six
"turbo" iterations with the Shannon capacity bound. For N = 4
and 8, the performance of LST-I is within 2.5-3 dB of the capacity
bound at 10% BLER (and about 3-3.5 dB at 1% BLER).
Since the BLER may vary as a function of the block size, it
is also important to consider the processing loss by discounting
the loss due to the inefficiency of modulation and coding. As
shown in Fig. 3(b), there is already a gap of about 2 dB between
the performance of our coding scheme and the Shannon limit.
Thus, the actual loss due to layered space-time processing is
only about 0.5-1 dB at 10% BLER.

Footnote: As an example, when we double the block size, the required average SNR
is 0.2-0.4 dB greater than the results shown here. However, this difference in
average SNR applies uniformly to all results, with or without layered space-time
processing.

Fig. 6. Layered space-time performance of LST-I with four transmit and four
receive antennas (N = 4). Quasi-static flat Rayleigh fading channel.

Fig. 7. Layered space-time performance of LST-I for N = 2, 4, and 8. Soft
decisions. 6 iterations. Quasi-static flat Rayleigh fading channel.

Fig. 8. Layered space-time performance of LST-II for N = 2, 4, and 8. Soft
decisions. 2 iterations. Quasi-static flat Rayleigh fading channel.

Fig. 8 shows the performance of LST-II for N = 2, 4, and
8, compared to the Shannon and C_LST-II bounds. Soft decisions
and two "turbo" iterations are assumed. (In this case, we
found the effect of decision errors to be marginal, i.e., the results
with hard and correct decisions were generally within 1
dB of the results shown here. Also, we found turbo processing
with more than two iterations to provide little improvement.)
At 10% BLER, the performance of LST-II is 2.5 dB from the
C_LST-II bound, and 3.5 dB from the Shannon bound. At 1%
BLER, however, the loss compared to the Shannon bound can
be as much as 6 dB.

From the above results, we conclude that, for N = 4 and
8, LST-I outperforms LST-II by a margin of 0.5 dB (at 10%
BLER) to 3 dB (at 1% BLER). For N = 2, the performance of
LST-I is greatly affected by decision errors (note that, even in
this case, LST-I still performs as well as LST-II at 10% BLER),
whereas LST-II can reach a lower BLER at high average SNR.
Based on these results, the layered space-time approach is
not highly recommended for N = 2. As mentioned earlier,
space-time coding is a better alternative to achieve a spectral
efficiency of 2 bps/Hz. For instance, a 64-state space-time
coded QPSK can perform to within 2 dB of the Shannon
capacity bound [23].

D. Frequency-Selective Channels 

Finally, we present an example of performance results for
frequency-selective fading channels. This example assumes N = 4
and the use of soft decisions for both LST-I and LST-II. Fig. 9(a)
and (b) show the results for the TU and HT profiles, defined in
Tables I and II. Again, we only show results with two "turbo"
iterations for LST-II because little improvement can be achieved
with more iterations in this case. For both delay profiles, the
performance at 10% BLER is within 3 dB of the Shannon bound
for LST-I with six iterations, and within 4 dB for LST-II with
two iterations. At a lower BLER, the loss relative to the bound
is greater for HT than for TU. This is due to the limitation of
the suboptimum space-time equalizer structure we assume, as
already discussed in Section IV-A.

Fig. 9. [...] frequency-selective channels: (a) TU profile, (b) HT profile. N = 4. Soft [...]



VI. Conclusion

By deriving the generalized Shannon capacity formula and
suggesting a layered space-time architecture that attains a tight
lower bound on the capacity achievable, Foschini has laid a
significant theoretical foundation for improving the wireless
channel capacity through multiple-element array technology.
We have shown that Foschini's lower bound is actually the
true Shannon bound when the output SNR of the space-time
processing in each layer is represented by the corresponding
"matched filter" bound. We then provided two coded layered
space-time approaches as an embodiment of this concept. For
a large number of transmit and receive antennas, coding across
the layers provides a better performance than independent
coding within each layer. However, with two transmit and two
receive antennas, the former is heavily affected by decision
errors and, therefore, provides a poorer performance than the
latter.

The underlying coding and signal processing techniques used
in this study are based on practical but suboptimal approaches.
Yet, such suboptimality can be greatly compensated for by
iterative processing. Overall, our coded layered space-time
approaches can achieve a performance within about 3 dB of the
Shannon bound at 10% BLER, about 2 dB of which is a loss
due to the practical coding scheme we assume. Thus, not only
is the layered space-time architecture exactly what the Shannon
limit has prescribed in a theoretical sense, but it also provides
an attractive general methodology for improving and achieving
the wireless channel capacity.

Appendix

Proving the Equivalence of Foschini Bound and
Shannon Capacity

Using the mathematical induction method, we will prove that
(4) and (11) are identical. In order to do so, we must show that

$$\det\left(\mathcal{R}_N K^{-1}\right) = \prod_{k=1}^{N} \left(1 + \Gamma_k(f)\right) \qquad (19)$$

where ℛ_N in the above equation is equivalent to R_N defined in
(13). Again, we assume that the kth layer has k - 1 interferences.
Note also that the proof provided here is independent of the
number of receive antennas M (the dimension of ℛ_N) and the
way the layers are ordered.

We start by assuming 1 signal source and M receive antennas.
It can be easily shown [1] that

$$\det\left(\mathcal{R}_1 K^{-1}\right) = 1 + H_1^{T} K^{-1} H_1^{*} = 1 + \Gamma_1(f). \qquad (20)$$

Next, we assume that (19) is true for the case of n - 1 signal
sources,

$$\det\left(\mathcal{R}_{n-1} K^{-1}\right) = \prod_{k=1}^{n-1} \left(1 + \Gamma_k(f)\right). \qquad (21)$$

We then show in the following that, given (21), (19) is also true
for n signal sources.
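First, we note from (13) that

$$\mathcal{R}_n = \mathcal{R}_{n-1} + H_n^{*} H_n^{T}. \qquad (22)$$

Using the matrix inversion lemma [4, Appendix D], we can
show that [similar to (12)]

$$\mathcal{R}_n^{-1} H_n^{*} = \frac{\mathcal{R}_{n-1}^{-1} H_n^{*}}{1 + \Gamma_n(f)}. \qquad (23)$$

It follows that

$$K \mathcal{R}_n^{-1} H_n^{*} = \frac{K \mathcal{R}_{n-1}^{-1} H_n^{*}}{1 + \Gamma_n(f)}. \qquad (24)$$

For convenience, let

$$A = \mathcal{R}_n K^{-1} \quad \text{and} \quad B = \mathcal{R}_{n-1} K^{-1}. \qquad (25)$$

We can rewrite (24) as

$$A^{-1} H_n^{*} = \frac{B^{-1} H_n^{*}}{1 + \Gamma_n(f)}. \qquad (26)$$

Furthermore, using the matrix identity [40]

$$A^{-1} = \frac{A_{adj}}{\det(A)} \qquad (27)$$

where A_adj is called the adjugate matrix of matrix A, we can
rewrite (26) as

$$\frac{A_{adj} H_n^{*}}{\det(A)} = \frac{B_{adj} H_n^{*}}{\det(B)\left(1 + \Gamma_n(f)\right)}. \qquad (28)$$

By replacing det(B) in the above equation using (21), we obtain

$$\frac{A_{adj} H_n^{*}}{\det(A)} = \frac{B_{adj} H_n^{*}}{\prod_{k=1}^{n}\left(1 + \Gamma_k(f)\right)}. \qquad (29)$$

Our goal is to prove that

$$\det(A) = \prod_{k=1}^{n}\left(1 + \Gamma_k(f)\right). \qquad (30)$$

Thus, given (29), we must show that

$$A_{adj} H_n^{*} = B_{adj} H_n^{*}. \qquad (31)$$

From (22) and (25), we have

$$A = B + H_n^{*} H_n^{T} K^{-1} = \left[\, b_1 + [H_n^{T} K^{-1}]_1 H_n^{*},\ \ldots,\ b_M + [H_n^{T} K^{-1}]_M H_n^{*} \,\right] \qquad (32)$$

where b_j is the jth column vector of B and [H_n^T K^{-1}]_j is the
jth element of the row vector H_n^T K^{-1}, for j = 1, ..., M;
for convenience, we use M here to indicate the overall receive
diversity order, including the effects of both multiple antennas
and excess bandwidth.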

We now prove (31) by showing that the jth element
of A_adj H_n^* is equal to the jth element of B_adj H_n^* for
j = 1, ..., M. Note that the jth element of A_adj H_n^* is given
by det(A_j), where A_j is obtained by replacing the jth column
of A by H_n^*. Similarly, the jth element of B_adj H_n^* is given by
det(B_j), where B_j is obtained by replacing the jth column of
B by H_n^*.

Using (32) and the linear properties of determinants, we can
show that (with a_i denoting the ith column of A, [H_n^T K^{-1}]_1 the
first element of the row vector H_n^T K^{-1}, and H_n^* in the jth column)

$$\det(A_j) = \det\left[\, b_1 + [H_n^{T} K^{-1}]_1 H_n^{*},\ a_2,\ \ldots,\ H_n^{*},\ \ldots,\ a_M \,\right] \qquad (33)$$

$$= \det\left[\, b_1,\ a_2,\ \ldots,\ H_n^{*},\ \ldots,\ a_M \,\right] + [H_n^{T} K^{-1}]_1 \det\left[\, H_n^{*},\ a_2,\ \ldots,\ H_n^{*},\ \ldots,\ a_M \,\right]$$

$$= \det\left[\, b_1,\ a_2,\ \ldots,\ H_n^{*},\ \ldots,\ a_M \,\right]. \qquad (34)$$

Similarly, we can expand the result of (34) with respect to the
second column, the third column, and so on (except for the jth
column). Eventually, we obtain

$$\det(A_j) = \det\left[\, b_1,\ b_2,\ \ldots,\ H_n^{*}\ (j\text{th column}),\ \ldots,\ b_M \,\right] = \det(B_j). \qquad (35)$$

According to the present Invention, the receiver can select a set of
antenna elements, including their number and/or identity, from among a
larger group of antenna elements in order, among other things, to improve
performance of the system without increasing the extent of radiofrequency
circuitry. One process for selecting antenna elements is to utilize equation 4
above as a measure of quality for the particular set of antenna elements being
evaluated. That evaluation can occur for each permutation or combination of
antenna elements in order to select the subset with optimum performance (as
determined, for instance, by selecting the subset with greatest value
calculated according to equation 4). This can occur at whatever desired
points in time, including periodically.
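
The exhaustive search described above can be sketched as follows. The quality measure is passed in as a callable, because the exact form of equation 4 is not reproduced here; the toy metric in the usage example (a sum of per-element gains) is purely a hypothetical stand-in and not the capacity expression itself.

```python
from itertools import combinations


def select_antennas(available, subset_size, quality):
    """Evaluate every subset of the given size and return the one with the greatest quality."""
    return max(combinations(available, subset_size), key=quality)


if __name__ == "__main__":
    # Hypothetical per-element gains, used only to exercise the search.
    gains = {"a0": 0.2, "a1": 1.5, "a2": 0.9, "a3": 1.1}
    best = select_antennas(sorted(gains), 2, lambda subset: sum(gains[e] for e in subset))
    print(best)
```

The number of subsets grows combinatorially with the size of the larger group, so an exhaustive evaluation of this kind is most practical when the group of available elements is small or the selection is repeated only periodically.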
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CLAIMS 

What is claimed is: 

1 . A radio receiver coupled to a plurality of receiver antenna elements, 
comprising a plurality of layers for processing signals received by the receiver 
elements, each layer comprising: 

a space-time equalizer, the space-time equalizer in a first layer of the 
plurality of layers coupled to each of the receiver elements, the space-time 
equalizer in each of the other layers of the plurality of layers coupled to an 
interference canceller which receives output from the layer preceding each 
said other layer. 

2. A receiver according to claim 1 further comprising a deinterleaver
coupled to each space time equalizer and to a decoder for output.

3. A radio receiver according to claim 1 in which each layer comprises its
own deinterleaver and decoder, and further comprises an interleaver adapted
to receive output from the decoder in said layer and from the deinterleaver in
said layer, the interleaver feeding output to the equalizer in said layer, thereby
allowing soft decisions about information being processed iteratively by said
layer to be fed back and forth between said equalizer and said decoder in said
layer.

4. A radio receiver according to claim 3 in which the interference canceller 
which receives output from the layer preceding each said other layer receives 
information from the decoder and the interference canceller in the preceding 
layer. 

5. A radio receiver according to claim 1 in which the equalizer in each
layer is connected to a common deinterleaver, which feeds a common
decoder, and in which a common interleaver receives signals from the
decoder, and is coupled to each equalizer in order to provide interleaved
signals to the equalizer.

6. A radio receiver according to claim 5 in which the interference canceller 
which receives output from the layer preceding each said other layer receives 
signals from the equalizer and from the interference canceller in said layer 
preceding each said other layer. 

7. A radio receiver according to claim 1 in which each space-time 
equalizer performs equalization using a minimum mean-square error criterion. 

8. A radio receiver according to claim 1 in which the receiver is adapted to 
receive and process signals from a transmitter that is coupled to a plurality of 
transmit antenna elements, the number of transmit antenna elements to which 
the transmitter is coupled being less than the number of receiver antenna 
elements to which the receiver is coupled. 

9. A radio receiver according to claim 8 in which the receiver is adapted to 
receive and process signals from a transmitter that is coupled to N transmit 
antenna elements, the number of receiver antenna elements to which the 
receiver is coupled is M, and M is greater than N. 

10. A radio receiver according to claim 9 adapted to be coupled to at least 
one set of M receiver antenna elements out of K available receiver antenna 
elements. 

11. A radio receiver according to claim 1 in which the receiver is adapted to 
select the sequence in which information from the receiver antenna elements 
is to be processed. 



12. A communications system, comprising: 

a radio transmitter coupled to a stream of information and to a plurality 
of transmit antenna elements, said transmitter adapted to apportion a portion 
of the stream of information to each transmit antenna element by interleaving 
said portions of the information stream among the transmit antenna elements; 
and 

a radio receiver coupled to a plurality of receiver antenna elements, 
comprising a plurality of layers for processing signals received by the receiver 
elements, each layer comprising: 

a space-time equalizer, the space-time equalizer in a first layer of the 
plurality of layers coupled to each of the receiver elements, the space-time 
equalizer in each of the other layers of the plurality of layers coupled to an 
interference canceller which receives output from the layer preceding each 
said other layer. 



13. A system according to claim 12 further comprising a deinterleaver 
coupled to each space-time equalizer and to a decoder for output. 

14. A system according to claim 12 in which each layer comprises its own 
deinterleaver and decoder, and further comprises an interleaver adapted to 
receive output from the decoder in said layer and from the deinterleaver in 
said layer, the interleaver feeding output to the equalizer in said layer, thereby 
allowing soft decisions about information being processed iteratively by said 
layer to be fed back and forth between said equalizer and said decoder in said 
layer. 



15. A system according to claim 14 in which the interference canceller 
which receives output from the layer preceding each said other layer receives 
information from the decoder and the interference canceller in the preceding 
layer. 

16. A system according to claim 12 in which the equalizer in each layer is 
connected to a common deinterleaver, which feeds a common decoder, and 
in which a common interleaver receives signals from the deinterleaver and 
decoder, and is coupled to each equalizer in order to provide interleaved 
signals to the equalizer. 

17. A system according to claim 16 in which the interference canceller 
which receives output from the layer preceding each said other layer receives 
signals from the equalizer and from the interference canceller in said layer 
preceding each said other layer. 

18. A system according to claim 12 in which each space-time equalizer 
performs equalization using a minimum mean-square error criterion. 

19. A system according to claim 12 in which the number of transmit 
antenna elements to which the transmitter is coupled is less than the number 
of receiver antenna elements to which the receiver is coupled. 

20. A system according to claim 12 in which the transmitter is coupled to N 
transmit antenna elements, the receiver is coupled to M antenna elements, 
and M is greater than N. 

21. A system according to claim 20 in which the receiver is adapted to be 
coupled to at least one set of M receiver antenna elements out of K available 
receiver antenna elements. 

22. A system according to claim 12 in which the receiver is adapted to 
select the sequence in which information from the receiver antenna elements 
is to be processed. 

23. A communications system, comprising: 

a radio transmitter coupled to a stream of information and to a plurality 
of transmit antenna elements, said transmitter adapted to apportion a portion 
of the stream of information to each transmit antenna element by interleaving 
said portions of the information stream among the transmit antenna elements; 
and 

a radio receiver coupled to a plurality of receiver antenna elements, 
comprising a plurality of layers for processing signals received by the receiver 
elements, each layer comprising: 

a space-time equalizer, a deinterleaver and a decoder, the space-time 
equalizer in a first layer of the plurality of layers coupled to each of the 
receiver elements, the space-time equalizer in each of the other layers of the 
plurality of layers coupled to each of the receiver elements and to an 
interference canceller which receives output from the decoder in the layer 
preceding each said other layer, each space-time equalizer coupled to the 
deinterleaver in its layer, each deinterleaver coupled to the decoder in its 
layer; 

the equalizer in each layer adapted to perform minimum mean-square 
error equalization to signals being processed; and 

an output for said stream of information coupled to the decoders in 
each of said layers. 

24. A system according to claim 23 in which the transmitter is adapted to 
interleave portions of the information stream among the transmit antenna 
elements randomly. 

25. A system according to claim 23 in which the transmitter is adapted to 
interleave portions of the information stream among the transmit antenna 
elements pseudo-randomly. 

26. A system according to claim 23 in which the number of transmit 
antenna elements to which the transmitter is coupled is less than the number 
of receiver antenna elements to which the receiver is coupled. 

27. A system according to claim 26 in which the transmitter is coupled to N 
transmit antenna elements, the receiver is coupled to M antenna elements, 
and M is greater than N. 

28. A system according to claim 27 in which the receiver is adapted to be 
coupled to at least one set of M receiver antenna elements out of K available 
receiver antenna elements. 

29. A system according to claim 23 in which the receiver is adapted to 
select the sequence in which information from the receiver antenna elements 
is to be processed. 

30. A process for communicating an information stream using a radio 
transmitter and a radio receiver, including: 

a. coupling a radio transmitter to the information stream and to a 
plurality of transmit antenna elements, and interleaving portions of the 
information stream among the transmit antenna elements; 

b. transmitting said portions of the information stream; 

c. coupling a radio receiver to a plurality of receiver antenna 
elements to receive the transmitted information stream; and 

d. processing the information stream in a plurality of processing 
layers, comprising: 

(i) in a first layer, coupling the receiver antenna elements to 
an equalizer and space-time processing the signals from the receiver antenna 
elements in the equalizer; 

deinterleaving the output from the equalizer; 

decoding the deinterleaved output from the equalizer; 

feeding the decoded information to a common output for 
the information stream; and 

(ii) in each successive layer, coupling to an equalizer the 
output from an interference canceller that is fed by output from the preceding 
layer and space-time equalizing said output from said interference canceller; 

deinterleaving the output from the equalizer; 

decoding the deinterleaved output from the equalizer; 

and feeding the decoded information to a common output 
for the information stream. 

31. A process according to claim 30 in which steps of deinterleaving the 
output from the equalizer, decoding the deinterleaved output from the 
equalizer and feeding the decoded information to a common output for the 
information stream are performed inside each layer and further comprising, in 
each layer, reinterleaving deinterleaved and decoded output from said layer, 
and feeding the reinterleaved signal to the equalizer in said layer thereby 
allowing soft decisions about information being processed iteratively by said 
layer to be fed back and forth between said equalizer and said decoder in said 
layer. 

32. A process according to claim 30 in which the equalizer in each layer 
performs minimum mean-square error equalization. 

33. A process according to claim 30 in which said interleaving is performed 
randomly. 

34. A process according to claim 30 in which said interleaving is performed 
pseudo-randomly. 

35. A process according to claim 30 in which coupling of said receiver to 
said receiver antenna elements includes selecting a set of receiver antenna 
elements from a larger group of receiver antenna elements. 

36. A process according to claim 30 in which coupling of said receiver to 
said receiver antenna elements includes coupling to a set of receiver antenna 
elements selected from a larger group of receiver antenna elements. 

37. A process according to claim 36 in which said coupling further includes 
selecting the sequence in which signals from receiver antenna elements are 
to be processed. 
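
By way of illustration only, the interleaving of portions of an information stream among transmit antenna elements, and the corresponding reordering at the receiver, might be sketched as follows. The function names, the seeded permutation, and the round-robin assignment of permuted symbols to antenna elements are assumptions made for this sketch; the claims do not specify a particular interleaving pattern.

```python
import random


def interleave_among_antennas(symbols, num_antennas, seed=0):
    """Pseudo-randomly permute the symbols, then deal them round-robin
    to the transmit antenna elements (hypothetical scheme).

    Returns the per-antenna streams and the permutation that was used,
    which the receiver is assumed to know (for example via a shared seed).
    """
    rng = random.Random(seed)
    perm = list(range(len(symbols)))
    rng.shuffle(perm)

    streams = [[] for _ in range(num_antennas)]
    for slot, src in enumerate(perm):
        # Slot k of the permuted sequence is sent on antenna k mod num_antennas.
        streams[slot % num_antennas].append(symbols[src])
    return streams, perm


def deinterleave_from_antennas(streams, perm):
    """Invert interleave_among_antennas and restore the original order."""
    num_antennas = len(streams)
    restored = [None] * len(perm)
    for slot, src in enumerate(perm):
        # Slot k sits at index k // num_antennas within its antenna stream.
        restored[src] = streams[slot % num_antennas][slot // num_antennas]
    return restored


if __name__ == "__main__":
    data = list(range(12))
    tx_streams, perm = interleave_among_antennas(data, num_antennas=3, seed=1)
    print(tx_streams)
    print(deinterleave_from_antennas(tx_streams, perm) == data)
```

In this sketch a fixed seed gives a repeatable (pseudo-random) pattern; a receiver would need the same permutation in order to deinterleave.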

Fig. 2. Space-time DDFSE with MAP processing 
Fig. 3. Performance of bit-interleaved 8PSK with rate-1/3 convolutional 
and turbo coding over (a) AWGN channel, and (b) quasi-static flat Rayleigh 
fading channels with N receive diversity antennas. 
Fig. 4. Capacity bounds for quasi-static flat Rayleigh fading channels with 
N transmit and N receive antennas. C: Shannon capacity, C_F: Foschini (original) 
bound, and C_LST-II: capacity bound for LST-II. 
Fig. 5. Layered space-time performance of LST-I with 2 transmit and 
2 receive antennas (N = 2). Quasi-static flat Rayleigh fading channel. 
Fig. 6. Layered space-time performance of LST-I with 4 transmit and 
4 receive antennas (N = 4). Quasi-static flat Rayleigh fading channel. 
Fig. 7. Layered space-time performance of LST-I for N = 2, 4, and 8. 
Soft decisions. 6 iterations. Quasi-static flat Rayleigh fading channel. 
Fig. 8. Layered space-time performance of LST-II for N = 2, 4, and 8. 
Soft decisions. 2 iterations. Quasi-static flat Rayleigh fading channel. 
Fig. 9. Layered space-time performance of LST-I over frequency-selective 
channels: (a) TU profile, (b) HT profile. N = 4. Soft decisions. 